Removing the trend of drift induced from acceleration noise for LISA 
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In this paper we demonstrate a methodology to remove the power of the drift induced from random 
acceleration on LISA proof mass in the frequency domain. The drift must be cleaned from LISA time 
series data in advance of any further analysis. The cleaning is usually performed in the time domain 
by using a quadratic function to fit the time series data, and then removing the fitted part from the 
data. Having Fourier transformed the residuals, and then convolved with LISA transfer function, 
LISA sensitivity curve can be obtained. However, cosmic gravitational-wave background cannot be 
retrieved with this approach due to its random nature. Here we provide a new representation of 
power spectrum given by discrete Fourier transform, which is applied to find the function of the drift 
power for the cleaning in the frequency domain. We also give the probabihty distribution used to 
analyze the data in the frequency domain. We combine several techniques, including Markov Chain 
Monte Carlo method, simulated annealing, and Gelman & Rubin's method, with Baye's theorem 
to build the algorithm. The algorithm is utilized to analyze 24 simulations of LISA instrumental 
noise. We prove that the LISA sensitivity can be recovered through this approach. It can help us 
to build algorithms for some tasks which are must accomplished in the frequency domain for LISA 
data analysis. This method can be applied to other space-borne interferometers if charges on their 
proof masses cannot be perfectly cancelled. 
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I. INTRODUCTION 

In the LISA data stream, intrinsic instrumental noise 
falls into two categories: shot noise and acceleration noise 
[l|. Acceleration noise, caused by the residual Coulomb 
force induced from the imperfect cancellation of charges 
on proof-masses, is dominant in the lovif-frequency range, 
resulting the sensitivity proportional to roughly be- 
low 2 mHz. Optical-path noise, including mainly shot 
noise and beam-pointing error, is dominant in the high 
frequency range, leading the sensitivity declining propor- 
tional to the frequency above 10 mHz due to the falloff 
of the antenna transfer function. 

The theoretical LISA sensitivity can be obtained from 
the various types of noise spectral densities directly [l|, 
whereas when we deal with LISA time series raw data, 
the trend of the drift of proof mass in the time series due 
to the random acceleration shall be removed at first. In 
general, the removal is performed in the time domain. By 
Fourier transforming the residuals, the LISA sensitivity 
is recovered. However, if stochastic gravitational-wave 
background exists in the data, bias might be induced if 
the trend is firstly removed in the time domain and then 
the background is extracted in the frequency domain. 
The analysis given by such sequential subtraction might 
be inaccurate, particularly if signals have overlaps. For 
instance, if a data set containing two overlapped signals is 
fitted by a linear filter where the signal is parametrized by 
a rectangular function of amplitude and location, the fil- 
ter may extract a stronger output around the overlapped 
region instead of one of the exact signals. This is because 
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what fitting does is to minimize the difference between 
data and model. For that reason, the parameter esti- 
mation and the removal of the trend in the LISA data 
analysis should all be performed either in the time do- 
main or in the frequency domain. 

The cosmological sources in the very early universe are 
randomly distributed across the sky, emitting gravita- 
tional waves with various amplitude and frequency. If 
they are not strong enough to be located by LISA, their 
incoherent signals will form a continuum and be entan- 
gled with instrumental noise. In order to gain cosmo- 
logical information, the functional form of the trend and 
cosmic gravitational-wave background (CGB) are both 
needed to be understood. 

The waveform of CGB in the time domain cannot be 
obtained due to its stochastic nature, making the sepa- 
ration of CGB from the time series data very difficult. It 
is natural to extract the CGBs in the frequency domain 
since the function of the power spectra can be written an- 
alytically. Nevertheless, the method to remove the drift 
in the frequency domain is unknown. In this paper, the 
function of the power spectrum of the drift will be de- 
rived, and a method to remove the trend in the frequency 
domain for LISA data analysis will be developed. 

In section [Hi we will review the time series of LISA 
instrumental noise, and demonstrate the approach con- 
ducted in the time domain to obtain the LISA sensitivity. 
In section we will find a new representation of Fourier 
power spectrum, and use the representation to derive the 
power spectrum of the drift trend. In section IIVI the 
derivation of probability distribution of noise power will 
be provided. In section |Vl the algorithm of parameter 
estimation will be introduced, followed by the result of 
data analysis. Finally, in section I VII conclusion will be 
given. 
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FIG. 1: The simulation of displacement shift of proof mass 
due to random acceleration. The black curve is simulation 
shifts. The red line (dash) is the best fit given by a linear 
function. The green curve (dash-dot) is the best fit given by 
a quadratic function. The blue curve (dash-double-dot) is the 
best fit given by a third-power function. 
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FIG. 2: The black line is the displacement error introduced 
by the acceleration noise in the frequency domain. The red 
dash line is the residual of linear fitting in the frequency do- 
main. The green line is the residual of quadratic fitting in 
the frequency domain. The blue line is the residual of cubic 
fitting in the frequency domain. 



II. REMOVE DISPLACEMENT CAUSED BY 
LISA RANDOM ACCELERATION IN THE TIME 
DOMAIN 

Energetic particles keep hitting the proof mass, pro- 
ducing random accelerations on it continuously. The per- 
turbations from random accelerations accumulate, and 
gradually depart the proof mass from its free-falling tra- 
jectory 0. The LISA sensitivity curve cannot be ob- 
tained just by directly Fourier transforming the drift in- 
duced from such accelerations. The drift must be fitted 
by a quadratic function and the best fit must be removed. 
Then the Fourier transformed residuals can represent the 
LISA noise level. In this section we will demonstrate how 
LISA sensitivity curve is obtained from fitting in the time 
domain. Firstly we will simulate the drift induced from 
the acceleration, and then fit the simulated data. We uti- 
lized Gaussian distribution to simulate the shifts ^. The 
acceleration noise power spectral density is suggested as 
9.0 X lO^'^" m? /s^Hz Dividing it by sampling rate dt 
and then taking square root, we will obtain the average 
of acceleration. Using this average as the standard devia- 
tions, we can draw a time series of random accelerations. 
Having double integrated the random accelerations, we 
will have the time series drift. Here the sampling rate dt 
is set as 1.5 second, and 8192 data are simulated. The 
simulation of drift is shown by the black curve in Fig. [1] 

Dividing the drift by arm-length the dimensionless 
drift can be obtained. Having Fourier transformed the 
dimensionless drift, we will get the strain amplitude, as 
shown by the solid line in Fig. [S] As shown in the Fig. [2l 
the frequency dependence of the strain is approximately 
1 // rather than 1//^ as indicated by the sensitivity curve 
in the LISA Prephase A study. In order to recover the 
LISA sensitivity the trend of drift must be removed. 

Since the drift is induced from random acceleration. 



it is natural to model its trend by a quadratic function 
at^ -f 6t + c where a, b, c are unknown parameters. Using 
the function to fit the drift, and then remove the trend, 
which is identified as green dash-dot curve in the Fig. [1] 
from the drift, the residual can be obtained. By Fourier 
transforming the residual, the true noise level can be re- 
covered. The strain amplitude of the residual is shown by 
the green dot curve in Fig. [51 Its frequency dependance 
is proportional to /^^ as shown in the LISA Prephase A 
study 

In addition to the quadratic equation, the linear equa- 
tion at + b and the cubic equation at^ + bt^ + ct + d are 
tested to remove the trend as well. The best fits given by 
the linear and cubic equation are indicated by red and 
blue line, respectively. From Fig. [2] it is noticed that 
the amplitude of the residual given by the fitting with 
linear function is lower than the uncleaned displacement 
by one order of magnitude, but it is inversely propor- 
tional to the frequency /, not to The amplitude of 
the residual given by the fitting with cubic function is 
still proportional to /^^, and is as the same level of the 
amplitude corresponding to quadratic fitting. One more 
parameter used in the cubic fitting does not provide extra 
benefit. 

Here we demonstrated that the LISA sensitivity curve 
is obtained by removing the trend of the drift. There is no 
problem if only point sources involve in data since point 
sources can be analyzed simultaneously with removing 
the trend in the time domain. However, if data contain 
the waves produced by stochastic sources, such as astro- 
physical foreground or cosmological background, which 
can only be analyzed in the frequency domain, removing 
the trend beforehand in the time domain would induce a 
bias in the analysis of the sources. A technique to deal 
with the trend in the frequency domain is necessary. 



III. EXPECTED POWER OF INSTRUMENTAL 
NOISE 

In last section the way to remove the trend of the drift 
in the time domain was described. In this section, a 
method to remove the trend directly in the frequency 
domain will be shown. 

Suppose ft,fc, fc : iV — 1 is the time series of data, and 
Hn is the Fourier transform of the data, the power 
is given by 

lH,i' = h„xh; (1) 

where k is index, and N is total number of data. The 
summation is usually calculated firstly over one index 
and then the other, or vice versa. This is implied by the 
functionality of summation. However, what does matter 
is summing the term over all k and k' on the N x N 
grid. The implied procedure is not the only approach 
to carry out the calculation. We calculate the summa- 
tion along the diagonal arrays as shown in the Figure [3] 
rather than the regular procedure. The terms along red 
lines have the property that the difference of k and k' 
is fixed. Thus we introduce an index t to indicate their 
difference k — k'. k equals k' on the diagonal line, so their 
phase is cancelled. The term hkhk' exp{2Trin{k — k')/N} 
turns to be Considering the arrays corresponding 
to t = 1 and t = —1, the terms along those diagonal 
arrays have symmetric mathematical expression. Their 
phases, 27Tn/N and —2i:n/N , have same magnitude but 
opposite sign. Because of that, the sum of those two 
is 2hkhk+i cos{2Trn / N) where k : ^ N — 2. Similarly 
the other symmetric terms with ±t can be combined to 
2hkhk+t cos{27rnt/N) where k : ^ N —1 — t. Therefore, 
we can rewrite Eq. ^ as the following form 

JV-l N-lN-l-t 

IH^I'^Y.'^I + ^II E h,h,^,cos^. (3) 

The benefit of new representation is that the time se- 
ries data can be separated into autocorrelation terms 
and cross-correlation terms hkhk+t- When we deal with 
the Fourier component of random noise, the autocorre- 
lation term will remain and cross-correlation term will 
vanish as their ensemble average is taken. The ensemble 
average is associated with the standard deviation of the 
time series random noise. As a conclusion, with this rep- 
resentation, the power spectrum of random noise can be 
expressed by some statistical properties of the time series 
data. 



A. Shot Noise 

Time series of shot noise can be characterised as inde- 
pendent Gaussian noise. This implies that their ensemble 




FIG. 3: Illustration of the summation of Fourier power spec- 
trum. 

average is zero, and they are not correlated, which can be 
expressed by {hkhk+t) — for any t ^ 0. The standard 
deviation of Gaussian distribution should equal to the 
ensemble average of square of data {hf.) if data length is 
infinite long. Knowing this we can apply Eq. to find 
a formula for the Fourier component of shot noise. Sub- 
stituting time series shot noise {hk} into Eq. ^ we will 
obtain 

N-l 

2 N~lN-l-t ^^^^ 

+ nT. (hkhk+t) COS ^ (4) 

(5) 

It is not surprised that the formula for describing the 
power spectrum is a constant since shot noise is white 
noise. 

B. Acceleration Noise 

To find the expression for the power spectrum of accel- 
eration noise, we review the characteristics of time series 
drift induced from random acceleration {ai} at first. The 
gross feature of {ai} can be recognised as independent 
Gaussian noise as well 

P,„.|/,.-^e.p{-iJ}. (6) 

First of all, the ensemble average (a^) is zero. Secondly, 
the random acceleration noise {a.;} is independent, so 
we have (aiflj) Vi ^ j. Thirdly, the ensemble average 
of square of (a^) equals to where a is the standard 
deviation of the distribution. 

Now we construct the Fourier component of the dis- 
placement noise step by step. We begin this work by es- 
tablishing the Fourier component of velocity noise. Sup- 
pose Vk is the time series of velocity noise associated with 
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random acceleration a and its initial value wq is zero. We 
assume that it obeys the equation 



Vk = Vk~i + afcA. 
With the initial condition we can derive 
Vk = (fli +a2 + ... + afc)A, 

and 

Vk+t = (ai + 02 + ... + ttk+t)^. 
Then, we know 

fc k 



vl 



A^(E«0(E«.) 

i=i j=i 



and 



VkVk+t 



(7) 
(8) 
(9) 

(10) 
(11) 



fc fc+t 

A^(E«0(E«.) (12) 

4=1 J=l 

/e k-\-t 

= A2(5:aO(E«. + E (13) 

4=1 J = l J=fc+1 

= A'[E«? + 2^ E 

-i— 1 1—1 
k /c+i 

+ E E (14) 

i=l j=/c+l 

Substituting Eq. ([TT]) and ([H]) into Eq. (O, we can ex- 
pand the power of velocity noise \Vn\'^ as 



2 A^-l k 



A2 



^"i^ = ^EE«? + 2i^EEE«^«. 



fe=i 1=1 



„ . , . , 2TTnt 

2— > > cos— — 



E E 

t=i fc=i 

k k 



k=l 2=1 j=i+l 
fc 

X 

1=1 



[E«.^ 



A; fc+t 



2 E E '^^"j + E E ^'^1 



(15) 



2=1 J = 2+l 



4=1 j = fc+l 



and all other terms are zero. The term ^costa; and 
t cos tx are given as 



^ sin^ ,N+1 
> cos tx = — ^ 

z—^ sin i 



cos(— — x), (17) 



t=i 



and 



N-l 



^ Nsin^^^x 1-cosiVx , , 
E^c°«*^=^T— I . . 2. ■ (18) 



2 sin f 4 sin^ f 



Substituting into x in Eq. (IT71) . the right hand side 
turns to zero. The left hand side of Eq. pT)) is 



2Trnt f 27rni 



27rn 

^ cos — — — h cos 27rn + cos — - . (19) 

t=i t=i 



With the information on the both sides of Eq. (ITTl) , we 
obtain 



E"'^ ^ 27rni 27rn 
cos = — 1 — cos ■ 



t=i 



N 



(20) 



Substituting 2^ into x in Eq. , it is then simplified 



as 



JV-l 



TV — 2 

E27rnt , , 27rn v-^ 27rnt 

icos^ = (^-l)cos— 



TV 

TV sin ^ _ N 
2 sin ^ " ^T' 



(21) 



Thus, we acquire 



E2TTnt N . ^ , 27m , ^, 

*cos^ = ---(iV-l)cos— . (22) 



TV 



iV 



For the expression of ^t^coste, it can be derived 
by differentiating ^tsinte with x. The close form of 
^ t sin tx is 

sinA^a; A^cos^^a; ^ ^ 
Y^tsmtx^^-^ ^— I— • (23) 



4 sin 



2 sin ■ 



Then we can calculate the expected power of velocity 
noise by taking ensemble average of |V„P 



Differentiating it with respect to x on the both sides, we 
get 



A 



A2 



N-l-t fc 



2Trnt \ ^ \ ^ 

, , , cos —rr- > > ( 

N ^ N ^ ^ 

t=l fc=l 4 = 1 



N-l 



fc=l 4 = 1 

A^g^ N{N-l) A^a"^ 
^V 2 ^ iV 

Af-2 

^ [<2 - (2iV - l)t + iV(iV - 1)] cos 



^ „ iV(2iV - 1) sin 2i^a; 

2^ t^ COS ia; = — 2 



4 sin I 



iV[| cosiVa; + i cos(iV - l)a 



4sin2f 



2Tmt 
N 



(16) 



sin Nx X 

H — cos — . 

4sin3§ 2 



(24) 



Substituting 27rn/iV into x, it gives 



cos ■ 



2Trnt 

w 



(iV- 1)2 cos — 



N-2 



cos ■ 



t=l 



N 



(25) 



iV(2A^-l) iV[f + icos2fi 



4sin^^ 



2 sin^ ^ 2 



(26) 



Moving the first term in Eq. (^5)) to Eq. (pS)). we obtain 
the expression 



iV-2 

cos 



27rnt , ,o 27m 
^ = -(A^-l)2cos— + 



2 sin^ ^ 2 



(27) 

Substituting Eq. Eq. ([22]), and Eq. ^ into Eq. 

(|16p . we have a compact form for the power of velocity 
noise 



(IK 



2sin2^ 



(28) 



Next, we consider the displacement noise of proof mass 
due to random acceleration noise. Suppose {xi\ is the 
time series of position noise with the initial condition of 
xq = 0. From Newtonian dynamics, the time series can 
be described by the following regression relationship 



a = 3.00e-15(ni/s"2Hz"l/2) 
a = 9.75e-16 (n!/s»2 Hz»l/2) 
a=5.18e-15(m/s"2Hz"l/2) 
a= 1.72e-15 (m/s''2 Hz» 1/2) 
a = 2.69e-15 imls"! Hz"l/2) 
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FIG. 4: Deconvolution of acceleration noise. The black solid 
curve is given by Eq. (f5i)l with a = 3.00 x 10~^^m/s^, which 
is the value of the standard deviation used to generated time 
series acceleration noise. The other four curves denote four 
realisations of the deconvolution of acceleration noise, respec- 
tively. They have different estimated value of acceleration as 
shown in graphs. 



i i+t 

XiX^+t = [A2^(i- j + l)aj][A2^(i + i-/c + l)aj] 

i=i k=i 

i i+t 

= 3 + l){i + t-k + l)ajak. (33) 

j=i k=i 

Substituting Eq. ^ and Eq. ^ into Eq. ©, then we 
can obtain the expected power of the drift 



.Ti = + w^A V« : 1 iV - 1. (29) 

From Eq. (|29p and the initial condition we know 

Xi = + UiA = a;i_2 + (t^-i-i + Wi)A 

= (wi + --- + w,)A. (30) 

By Eq. (|8]) the velocity can be calculated from the ran- 
dom acceleration noise 



Xi = [al + {al+a2)^ h^a^JA^ 

fc=i 

■i 

= A2^(i- fc + l)afc. (31) 

k=l 

From Eq. (|3T|) we can compute xf and XiXi+t 

i i 

x} = [A2^(^-J-^-l)a,.][A2^(^-fc + l)a; 

3 = 1 k=l 

(32) 



i j~l 

+ 2^^(i-j + l)(f-fc + l)ajafc 

j=2 fc=l 



N-1 



N-1 N-l-t 



{\Xn 



^E(^?) + §E E {x^X^+t)cOS 



2'Knt 



AV 

N 



t=l i=0 



N 



2A 



JV-2 



EE(-^- + i)^ + ^E 



cos ■ 



t=l 



i=l j=l 

N-l-t i i+t 

^ E EE(*~-?' + i)(*+^-^+i)'5jfe 



2'Knt 



i=l j=l k=l 
A^a^N{N^ - 1) A^a^ — ' 



12 



6iV 



E 

t=i 



cos ■ 



2nnt 



X [-t"* + 2Nt^ + - 2NH + iV2(iV2 - 1)] . (34) 

We now have the expressions of X^t^^os^^ and 
cos 2^ but we still need those of Y.t cos ^ 
and X^t^^'cos^ to complete Eq. They can be 

obtained by differentiating ^ji^sinte and '^^t^ costx 
with respect to x once and twice, respectively, and then 
substituting 27m /A^ for x. Then we will get 



^ r, 2'Knt 
r cos — — 
iV 



~ + 4sin2^' 



(35) 



6 



and 

N-l 



t=i 



271111 

N 



N(N^ + 1) 



3N 



(36) 



Substituting Eq. ([181), Eq. ([271), Eq. ([35]), and Eq. §6 
into Eq. ([Ml), we finally obtain 



1 



1 



sin*^ Ssin^^ 



N 



N 



(37) 



There are two terms in Eq. (1371) involved in the power 
of drifts. One is inversely proportional to sin** the 



N ' 



other is sin^ If n is small, we can approximate sin'* ^^ 



N 

and sin^ ^ by and , respectively. The strain ampli- 
tude corresponding to these two terms can be obtained 
by taking square root. There is not only 1//^ term, but 
also 1// term. Moreover, the strain amplitude of drifts 
is dominant by 1// term especially in the high frequency 
range. As we have done in the analysis in the time do- 
main, in order to reveal the true noise level, the power 
of drift must be removed in the frequency domain. Eq. 
([57)1 can be employed to estimate the drift power. 



IV. PROBABILITY DISTRIBUTION OF NOISE 
POWER 

The cosmic gravitational-wave background entangles 
with noise in time domain, forbidding the analysis per- 
formed. Hence, we intend to analyze a data set in the 
frequency domain through its power spectrum. The mea- 
surement d{t) is the sum of signal s{t) and Gaussian ran- 
dom noise n(t) where the Gaussian noise is drawn from 
the distribution 



P{n) = 



1 



27rCT 



exp 



{-&} 



(38) 



1. Fourier Amplitude of Noise 

Giving a time series of noise {uj} drawn from Eq. 
its Fourier amplitude is given by 

N-l 



(39) 



= Re{no + Wni + --- + W^'^'^-'^'^nN-?j 

+ i Im(no + Wni + ■■■ + W'^^^'^^nN^i'j (40) 



N-l 

j=0 3=0 



ill. + *^fc 



(41) 
(42) 



where N is the number of noise data, and W = 
exp{27ri/A^}. and h]. are the real part and imagi- 
nary part of fik, and Rkj and Ikj denote the real part 
and imaginary part of W^^ , respectively. 



2. Probability of and h\. 

From the probability distribution of time series instru- 
mental noise given by Eq. (1381) , we can derive the proba- 
bility distribution of hi by marginalizing the probability 
distribution Pifi^, rig, rii, • • • , njv-i|-^) over no • • • riN-i'- 



'■k) ^ I ' ' I dnodni ■ ■ ■ driN-i 

X P{hl,no,ni,- ■ ■ ,nN-i\ I) (43) 

where P(nJJ, riQ, rii, ■ ■ • ,njv-i|/) is the probabil- 
ity distribution of n^, riQ, • • • ,npf^i being true 
given the background information /. From Baye's 
theorem, P(r7,^, np, ni, • • • , njv_i|/) can be de- 
composed into the product of likelihood function 
P{hl.\ no,ni, • • • ,njv_i) which is the probability distri- 
bution of for a given {no,ni, • • • ,nN-i}, and prior 
function P{n(),ni,--- ,nN^i) 



-P(^fc) = /"'/ dno- ■ ■ dnN-iP{no,- ■ ■ ,nN-i) 

J J —OO 

X P{fil\ no,-- - ,nN-i) (44) 



Since njl is completely determined by Eq. (|42|) if no, 
ni, - - - , n^v-i are known, P(n5J| no, ni, - - - , njv-i) must 
satisfy 



N-l 



P{nl\ no,ni, - - - ,njv-i) ^s(hl- ^ Rk]nj^ (45) 

j=o 

where d{x) is the delta function of x. Moreover, if 
no, ni, n^v-i are independent random variables, 
P(no, ni, - - - , n^v-i) can be written by 

P(no, ni, - - - , riN-i) - P(no)P(ni) - - - P(njv-i). (46) 

Therefore, Eq. (|44)) can be rewritten as 

■P(^fc) = / ■■■/ (5f n^ - ^ Pfc^rij - nojP(no)dno 



X P(ni) - - - P{npf^i) dni ■ ■ ■ dnpf^i. 



(47) 



Since the delta function is involved, the integration 
over no produces P(no) directly where no equals nJJ — 
12^=^1^ ^kj^ij. Since P(no) is a Gaussian distribution, we 
will have 



7 



2TTa 



n 

1=1 



P(ni)exp-\ }dni---dnN-i. 



Combining P{ni) with the exponential term, it becomes 



(48) 



Pinl) 



dn2 ■ ■ ■ duN-i P{ 



ni 



1=2 



dni exp 
I 



^kjnj — Rkni) 



(49) 



where P{nj) is given by Eq. psp for any integer j. De- 
noting K as the integral term over ni in Eq. (|49p . and a 



as fi^ — X]^2^ RkjiT'j-, the integral can be simplified as 



K = 



/dni exp < — 
-oo 



2a2 



27rcr2 



■ exp 



(f + Rl)nl - 2aRkni+a^ 
2, 



{ 2(l + Rl)a^}- 



Substituting Eq. ([SOI) into Eq. (gS]) we have 
f 



(50) 



P{hl) 



^2n{l + Ri)a^ 



dn:) ■ ■ ■ dn 



N-1 



2(l + i?D^^ 



n i^(rii)exp{ 



!=2 



-}• (51) 



We can integrate over from n2 to un-i with the same 
process used in integrating ni. Then P(np can be writ- 
ten as follows 



~r 2 



27ra2 



: exp 



I 2^2 / 



(52) 



where cr|r denotes cr^ S jlo^ ^fcj ' With the same steps 
P{n],) can be found as 



1 



27ra2 

n 

W-l 7-2 



exp 



(53) 



where cr?i denotes cr^ ^^.^^ /2^._ 



distribution of P{Pk) from Eq. (l52|) and Eq. This 
can be done by a series of changing variable. At the first 
place the variables of the probability distributions were 
changed from n|J and n\, to Par and P^i , respectively, 
and then changed from P^^r and P^^,^ to P^. Although 
the variables were changed, the integral of probability 
over entire region should be the same (and equal to one) . 
Therefore, it is known that 

/•oo /-oo 

/ P{Pai) dPai = / P(nD dni 

Jo J-oo 

POO 

= 2 / P{nl) dni. (54) 

The range of P^^ is from zero to infinity since P^r is 
positive. The second line is obtained from the symmetry 
of P(nl) about zero. From Eq. (|54|) we know 



P{PaO = mnl) 



dhl 



dPfr 



(55) 



where 



is Jacobian. Since fi^, = y^Phi , the Jaco- 
bian is ^P^r^^^- Substituting Eq. (|52|) and the Jacobian 



into Eq. (|55|) we have 
1 



1 



: exp 



{-St} 



With the same steps we can derive P(Pfji ) 



(56) 



PiPf.O 



1 



1 



p. 



: exp 



(57) 



3. Probability distribution of P^r and P^^i 



The total power contained in the kth Fourier compo- 
nent is Pk = Phi + Phi = "fe ^ + ^fc ^ where P^i = fi^ ^ 
h\ 2 . Now our goal is to derive the probability 



4- Prohahility Distribution of Pk 

Now we would like to know the probability distribution 
of Pk for given Ph^ and Pfji . From marginalisation we 



8 



know that 



P{Pk\I) - 11^^ dP^^dPf,. P{P,,P,~,.,P,, 



dPf^rdP,,. P{Pk\ Pnl,Pnl,I) 



X P(Pf..;,Fail /) 



(58) 



where the second hne is given by Baye's theorem. Since 
Pk is the sum of P^r and Pf^i^, P{Pk\ Phi,PniJ) is a 
delta function of Ph — Pf.^ — Pt.^ ■ Because Par- is inde- 
pendent of Pf^i , P{Pn^^ , Pn^ I I) can be decomposed as 



J 



P{Pk) - 



'^'^^nlOni Jo 



Pk 



dP~, 



: exp 



where the upper bound of Pn^ is subject by Pk- Mov- 
ing the exponential term involving with P^ out of the 



J 



P{Pk) = 



1 



■ exp 



Pk 



P{P^^J X Substituting P{Pk\ Pni,Pf,.^,I) and 

P{Pni, PfiiJ I) into Eq. ((58| . and then integrating over 
P^i , Eq. dlHl) is led to 



PiPk) ^ J J 5{Pk-Pni-PnO 

xP{P^rjP{Pf,.J dP^rdP^,^. (59) 

Integrating the delta function over P^^i will give 
PiP^r )P{P^ - P^A, resulting in 



Pf.r 



exp 



integral, then we can obtain 



dPf, 



PndPk-PAl) 



: exp 



^2 ^2 



-Pnl 



(60) 



(61) 



Expanding cr|r and cr?; Eq. (|5T|) can be simplified: 



this approximation we can simplify Eq. (j6ip as 



w-i 



JV-l 



cos 



TV 



J=0 j=0 
.2 



2-Kkj 

N 



cos ■ 



(62) 



exp 



1 r 



1 



r ^1 



exp 



1 



Pn;; (Pfc Pftr. ; 



7r/2 



2d6l 



(64) 



and 



ah 



N-l 



N-l 



sm 



iV , cr^ ^ 



271 kj 
N 



N 



2 



cos ■ 



J=0 



Ankj 



Since the order magnitude of J2f=o^ cos is 1, cr|r and 
cr?i can be approximated as Na^/2 if N is large. With 



(63) 



where the variable Pn^ is changed as Pk sin^ 9. Since 
Na'^ is the noise power P„, Eq. ((64)) can be rewritten in 
another form: 



P(P,) = -1 exp {-^}. 



(65) 



Eq. (j65p is the probability distribution of power in the kth 
Fourier component which we will use in data analysis. 

There are several features on P{pk). First, it is nor- 
malised: 



P(Pfe) dPk^^ f dPk exp { - ^} = 1. (66) 



P 



n Jo 



Second, it is not Gaussian. Its maximum is at Pk 
but its expectation value is the noise power P„: 

= / exp I - ^ I dPfc 



0, 



(67) 



The second hue is given by integration by part. On the 
other hand, if we calculate {n*fjhi^) directly from Eq. ([3]), 
it is found that 



3=0 j=Q 



N-1 



N-1 



2 / 2\ 



J=0 
N-1 



J=0 



(68) 



where the second line is obtained because it is assumed 
that Tii and rij are not correlated for any i and j. This 
result is consistent with Eq. ([ST]). Third, its uncertainty 
is P„ as well: 



where 



(Pi) 



4. 



{Pi) 



(69) 



2 / p^expf-:^} dPk 

Jo ^ ' 

2F„ / exp { - ^ I dPk 

Jo ^ ^ri J 



dPk 



2Rf. 



(70) 



Substituting Eq. ^ and Eq. ^ into Eq. ^ we have 
CTp^ = P^. Fourth, the likelihood of Pk located within 

Pr, ± P„ is 



2P„ 



P(Pfc) dPfc = 1 



86.47%. 



(71) 



V. DATA ANALYSIS 

The parameter estimation algorithm is built upon the 
Bayesian statistics. Q Baye's theorem allows us to de- 
compose the probability density function of hypothesis 
into likelihood function and prior, which are easier to 
assign. The main challenge of the algorithm is to com- 
plete estimation within a reasonable time. For instance. 



le-22 




I le-28 



le-30- 



16-32^ 



0.0001 



0.001 0.01 
Frequency (Hz) 



FIG. 5: The deconvolution of residuals of three realisa- 
tions of acceleration noise after de-trending are displayed 
in different curves. The estimated acceleration of black, 
red, and green curves are 9.75 x 10~^^, 5.18 x 10~^^, and 
1.72 X 10~^^ m/ H z^^^ , respectively. The residuals were 
converted from the observation time of 12288 sec to 1-yr ob- 
servation. 



even though just ten values were tried for each param- 
eter, 10^° trial parameter sets are necessary to find the 
best estimate if the model contains 10 parameters. To 
solve this, Markov Chain Monte Carlo (MCMC) method 
was adapted. Furthermore, the simulated annealing 
is applied to speed up the search of the Markov 
Chain. With the Markov Chain Monte Carlo method, 
we can sample parameters from the likelihood function. 
If the number of samples is plenty, the distribution of the 
parameter shall be close to the posterior. Then the value 
of parameters, the uncertainties, and the correlations can 
be estimated directly from the samples. 

The acceptance of the candidate state is condi- 
tional on its relative probability to the current state 
P(x("+i) \data, /)/P(x(") \data, I). With Baye's theorem, 
we can expand P(x("+^^ |c?ota, /) and P(x'^"^ |daia, /). 
Since the same model and the same data set are used, 
the prior and the evidence will be cancelled out. The rel- 
ative probability is then reduced to their likelihood ratio 



P(x("+i)|dato,/) _ P(data|x("+i),/) 
P(x(")|data,/) ^ P(data|x("),/) 



(72) 



Substituting Eq. ^ into Eq. ^ the likelihood ratio 
can be expressed as 

P(x("+i)Mato,/) _ cxp{-^(pf-pf(x("+i)))} 



P(x(")|(iaia,/) 



exp 



{-^(pf-pf(x(")))} 



(73) 

where pf is the power in the zth data, and (x^"+^)) and 
p|(x("^) are the signal power given by the state x'^"+^' 
and x*^"^ , respectively. With this relative probability we 
can compute the transition probability 



A((x("+i)),(x("))) =mm 



P(x("+i)|dato,/) \ 
' P(x(")|(iota,/) j ^ ' 



to determine whether the candidate state should be ac- 
cepted. 

There are two concerns about the convergence of chain. 
One is that the chain may converge very slow. The other 
is that the chain may not converge as the sampling is 
terminated. To deal with the first one, the simulated an- 
nealing is applied. This is a dynamic way to change the 
moves between the samples, ft encourages bold moves 
in the beginning of the sampling to prevent the chain 
from getting stuck in local minimums. After this burn- 
in phase, the moves will be adapted to be conservative 
to speed up the sampling. For the second one, a diag- 
nostic to monitor the convergence of chain is necessary. 
According to the way of monitoring, the diagnostics were 
classified as qualitative (graphical) or quantitative. Some 
of them generate a very long chain to do monitoring, 
and the others generate multiple shorter chains to pro- 
ceed. The Gelman & Rubin's method f?\ is used as a 
diagnostic of the convergence of chain. It is used in var- 
ious fields, such as Cosmic Microwave Background data 
analysis \S\. Gelman & Rubin's method observes the con- 
vergence with multiple-chain approach. The idea is that 
with-in chain variance and between-chain variance shall 
be very close if the chains are converged; otherwise, the 
between-chain variance shall be larger than with-in chain 
variance. Cowles and Carlin have reviewed various diag- 
nostics of convergence @ . 

Combining the Bayes's theorem, the Markov Chain 
Monte Carlo method, the simulated annealing method, 
and the Gelman & Rubin's method, the detail process 
is as follows. First, draw number of starting points of 
chains from a uniform distribution. The required num- 
ber of chains is 10 times the number of parameters in the 
model. Second, apply simulated annealing to generate 
candidate states. Third, use Eq. (|55|) to calculate the 
likelihood ratio between two states as transition proba- 
bility P. If P is larger than 1, then the candidate state is 
accepted. If P is smaller than 1, we generate a random 
number r from a uniform distribution for < r < 1. If 
r < P, then the candidate state is still accepted; other- 
wise, the candidate state is rejected, and we generate a 
new one to repeat the process. Fourth, use the Gelman & 
Rubin's method to calculate the factor R to monitor the 
convergence. If i? > 1.01, this indicates that the chains 
may not be converged, and more samples are required. If 
R < 1.01, it is suggested that the chains are converged, 
and we use the second half of the chains to estimate the 
expectation value 



1 ^ 



(75) 



uncertainties 



N 



N 



1 ^ 

n=l 



N 



TV- 1 



(76) 
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FIG. 6: The black curve is the strain of the sum of shot 
noise and drift trend induced from the random acceleration. 
The green dash curve represents the estimation of the drift 
trend. The bottom red curve is the residuals after removal, 
regarding as the instrumental noise. The observation time is 
12288 sec. 



and correlations 



cor (a;*, x-') 



(77) 



of parameter set {x*} where i = 1, 2, • • • , m and N is the 
length of the second half of the chains i\ 

We simulated 24 realisations of deconvolution of the 
random acceleration. The deconvolution is given by 



ha = 



Rif) 



(78) 



where Pa{f) is the power of random acceleration given 
by Eq. and i?(/) is the LISA transfer function. The 
value of the standard deviation used in the Gaussian 
distribution for generating acceleration noise is 3.00 x 
10^^^ m/s^V Hz. Fig. 0] shows four of realisations. The 
black solid curve indicates the deconvolution ha with this 
magnitude. As revealed in the figure, all realisations have 
different amplitudes but share the same pattern. The 
amplitude of acceleration noise spectrum is proportional 
to the final trajectory discrepancy of proof-mass from its 
free-fall track. This suggests that the coefficient a in Eq. 
([57| is the 'average' acceleration corresponding to each 
realisation, and it inherits the random nature of time se- 
ries acceleration noise. The removal of acceleration noise 
aims to find a value for a to provide the best description 
of data with Eq. (1571) . 

The magnitude of a is estimated through the param- 
eter estimation algorithm. The samples of a are drawn 
from Eq. (j74p where is given by 



4^2 



1 



iV2 



1 



N 



3sin^ 



N 



and noise power Pn{f) is given by 



12288L2i?(/) 



(79) 



(80) 
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where L is the arm-length 5 x 10^ m, 12288 is our obser- 
vation time in second, R{f) is the LISA transfer function, 
Ss is the shot noise spectral density 1.04 x 10^^^ rn^ / Hz, 
and Sa is the acceleration noise spectral density 9 x 
10"'^'^ TV? I Hz. Although the best estimates of a are 
all different for each realisation, the noise levels are the 
same as Fig. [S] shows. 

The raw data of instrumental noise, as presented by 
black curve in Fig. [HJ is sum of drift trend caused by 
random acceleration and shot noise. The estimation of 
the drift trend is displayed by the green dash curve in 
Fig. [HI The residual after the removal is shown by the 
red curve in Fig. [6] In the frequency range below 1 mHz 
where acceleration noise is dominant, the residuals de- 
crease as 1//^ as expected from acceleration noise. In 
the frequency range above 10 mHz where shot noise is 
stronger, the residuals roughly increases as /. Between 
1 ~ 10 mHz where shot noise and acceleration noise 
are comparable, the lowest region of the noise is around 
5-10 mHz. 



VI. CONCLUSION 

In Sec. |TT] we have illustrated the removal of the drift 
trend of LISA proof mass in the time domain. By using 
a quadratic function to fit the data, the trend can be 
estimated and be removed from the data. Converting 
the time span of the data to one year observation, the 
frequency dependance and magnitude of the cleaned data 
match the low frequency part of LISA sensitivity curve. 
In addition, a cubit function was used to fit the trend as 
well, but it did not give a better fitting, indicating that 
the the instrumental noise in the low frequency band as 
shown in the LISA sensitivity curve is intrinsic. 

Since the cosmological sources are randomly dis- 
tributed across the sky, the emitted gravitational waves 
will form a continuum and be mixed with the LISA in- 
strumental noise in the time domain. In order to sepa- 
rate the background from data, the power spectrum of 
the drift trend of the proof mass shall be found in the 
frequency domain. To achieve this, firstly we have re- 
formulated the Fourier power spectrum in Sec. IIIIl With 
this new representation, the product of time series data 
niUj in the Fourier transform is separated into autocor- 
relation part nf and cross-correlation part niUj where 
i ^ j. The advantage of this representation is that the 
ensemble average of the cross-correlation part will vanish 



when we deal with purely noisy data. 

Next, we have applied the new representation to derive 
the expected power spectrum of the drift. It is found that 
the strain amplitude of shot noise is white, and that of 
velocity noise induced from random acceleration follows 
^ 1// as expected. As for the amplitude of displace- 
ment noise caused from the acceleration, it is thought to 
be proportional to inverse square of frequency, resulting 
from integrating time by part twice. However, we real- 
ized that it depends not only on a term associated with 
— 1//^, but also on a term proportional to 1//. More- 
over, the 1// term is the dominant component. 

It is known that the time series data of the drift is sub- 
ject to Gaussian noise, but we cannot sure that its coun- 
terpart in the frequency domain is subject to Gaussian 
noise as well. In Sec. IIVI we have derived the probability 
distribution which the power of the drift in the frequency 
domain is subject. The probability distribution is expo- 
nential. Some characters of the distribution, such as the 
mean and uncertainty, have been given as well. 

In Sec. |V] the algorithm for the data analysis has been 
described. The algorithm was built upon the Bayesian 
statistics, which was collaborated with a Markov Chain 
Monte Carlo method to enhance the efficiency of the anal- 
ysis. Simulated annealing was employed to encourage 
Markov Chains to explore entire parameter space. The 
Gelman & Rubin method (1992) was chosen as a diag- 
nostic for the convergence of the chains to confirm all 
statistical results being reliable. 

We have employed the algorithm to analyze 24 realiza- 
tions of drift trend lasting 12288 sec. It is found that the 
frequency dependence of the strain given by the acceler- 
ation noise is ^ 1//^. We convert our results to a data 
se t of 1-year observation t ime by multiplying a factor of 
v/l2288sec/l — yr in sec, giving that the strain ampli- 
tude induced from acceleration noise is around 5 x 10~^* 
at 1 mHz, which agrees with the acceleration noise in 
the figure 4.3 in [l|. 

In this paper we have demonstrated an approach to 
clean the power of the drift induced from the random 
accelerations on LISA proof mass in the frequency do- 
main. The approach can be applied to other space-borne 
interferometers as well if charges on their proof masses 
cannot be perfectly cancelled. We have shown that the 
LISA sensitivity can be recovered with this approach. 
This approach allows us to construct a more complicated 
algorithm to detect stochastic gravitational-wave back- 
ground in the LISA data stream. 
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